Validating statistical index data represented in RDF using SPARQL queries

Authors

  • Jose Emilio Labra Gayo
  • Jose M. Álvarez
Abstract

The creation and use of quantitative indexes is a widely accepted practice that has been applied to numerous domains such as bibliometrics (impact factor), research and academic performance (H-Index or Shanghai rankings), cloud computing (Global Cloud Index, by Cisco), etc. We consider that those indexes and rankings could benefit from a Linked Data approach where the rankings could be seen, tracked and verified by their users. We participated in the Web Index project (http://thewebindex.org), which created an index to measure the impact of the Web in different countries. The 2012 version offered a data portal whose data was obtained by transforming the raw observations and precomputed values from Excel sheets to RDF [3]. In the 2013 version of that data portal, we are working on both validating and computing observations, automatically deriving and populating new values, so that the validation and even the generation of the index from raw data can be automated. We have defined a generic vocabulary of computational index structures which could be applied to compute and validate any other kind of index and can be seen as an instance of the RDF Data Cube vocabulary [2]. The validation process employs SPARQL [4] queries to model the different integrity constraints and computation steps in a declarative way. At this moment, we have a running example and a validator which reads and executes the SPARQL queries. Source code and some examples are available at https://github.com/weso/computex. Although our prototype validator has been implemented in Scala, our approach is independent of any programming language as long as it can load and execute SPARQL 1.1 queries. Throughout the paper we will use Turtle and SPARQL notation and assume that the namespaces have been declared using the most common prefixes found in http://prefix.cc.
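
To make the idea of modelling an integrity constraint as a SPARQL query more concrete, the following is a minimal sketch and not a query taken from the computex repository. It assumes the data uses the standard RDF Data Cube terms qb:Observation and qb:dataSet, and it checks the hypothetical rule that every observation belongs to exactly one data set. The variable names are illustrative only.

```sparql
PREFIX qb: <http://purl.org/linked-data/cube#>

# Hypothetical integrity check, not taken from the computex repository.
# Returns true when the data VIOLATES the constraint
# "every qb:Observation belongs to exactly one data set".
ASK {
  {
    # Violation 1: an observation with no data set at all
    ?obs a qb:Observation .
    FILTER NOT EXISTS { ?obs qb:dataSet ?anyDataset }
  } UNION {
    # Violation 2: an observation linked to two distinct data sets
    ?obs a qb:Observation ;
         qb:dataSet ?dataset1, ?dataset2 .
    FILTER (?dataset1 != ?dataset2)
  }
}
```

Under this convention a validator only needs to run each ASK query against the loaded graph and report the constraints whose query returns true, which is one way the approach can stay independent of the host programming language.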

Related resources

Massive-Scale RDF Processing Using Compressed Bitmap Indexes

The Resource Description Framework (RDF) is a popular data model for representing linked data sets arising from the web, as well as large scientific data repositories such as UniProt. RDF data intrinsically represents a labeled and directed multi-graph. SPARQL is a query language for RDF that expresses subgraph pattern-finding queries on this implicit multigraph in a SQL-like syntax. SPARQL quer...

Terp: Syntax for OWL-friendly SPARQL Queries

Web Ontology Language (OWL) [5] can be seen as an extension of Resource Description Framework (RDF). The primary exchange syntax for OWL is RDF/XML, and every OWL ontology can be represented as an RDF graph. But there is no standard query language specifically for OWL ontologies. The most commonly used Semantic Web query language is SPARQL [7], which is intended to be used for RDF. Roughly spea...

Querying RDF Data Using A Multigraph-based Approach

RDF is a standard for the conceptual description of knowledge, and SPARQL is the query language conceived to query RDF data. The RDF data is cherished and exploited by various domains such as life sciences, Semantic Web, social network, etc. Further, its integration at Web-scale compels RDF management engines to deal with complex queries in terms of both size and structure. In this paper, we pr...

Using an index of precomputed joins in order to speed up SPARQL processing

SPARQL is a query language developed by the W3C, the purpose of which is to query a data set in RDF representing a directed graph. Many freely available or commercial products already support SPARQL processing. Current index-based optimizations integrated in these products typically construct indices on the subject, predicate and object of an RDF triple, which is a single datum of the RDF data, i...

RDFMatView: Indexing RDF Data for SPARQL Queries

The Semantic Web is now gaining momentum due to its efforts to create a universal medium for the exchange of semantically tagged data. The representation and querying of semantic data have been made by means of directed labelled graphs using RDF and SPARQL, standards which have been widely accepted by the scientific community. Currently, most implementations of RDF/SPARQL are based on relationa...

Publication date: 2013